The Collective Data Mining : a Technology for Ubiquitous Data Analysis from Distributed Heterogeneous Sites
نویسندگان
چکیده
This paper introduces the collective data mining (CDM), a unique approach to distributed data mining (DDM) from heterogeneous sites. It points out that naive approaches to distributed data analysis in a heterogeneous environment may face ambiguous situation and may lead to incorrect global data model. It also observes that any function can be expressed in a distributed fashion using a set of appropriate basis functions and orthonormal basis functions can be effectively used for developing a general framework for DDM that guarantees correct local analysis, resulting in correct global data model using minimal data communication. The paper develops the foundation of CDM, presents a case study for decision tree learning in CDM, and brieey describes BODHI, a CDM based experimental system.
منابع مشابه
Clustered Collaborative Filtering Approach for Distributed Data Mining on Electronic Health Records
Distributed Data Mining (DDM) has become one of the promising areas of Data Mining. DDM techniques include classifier approach and agent-approach. Classifier approach plays a vital role in mining distributed data, having homogeneous and heterogeneous approaches depend on data sites. Homogeneous classifier approach involves ensemble learning, distributed association rule mining, meta-learning an...
متن کاملCollective Data Mining: A New Perspective Toward Distributed Data Mining
This paper introduces the collective data mining (CDM), a new approach toward distributed data mining (DDM) from heterogeneous sites. It points out that naive approaches to distributed data analysis in a heterogeneous environment may face ambiguous situation and may lead to incorrect global data model. It also observes that any function can be expressed in a distributed fashion using a set of a...
متن کاملDistributed Incremental Least Mean-Square for Parameter Estimation using Heterogeneous Adaptive Networks in Unreliable Measurements
Adaptive networks include a set of nodes with adaptation and learning abilities for modeling various types of self-organized and complex activities encountered in the real world. This paper presents the effect of heterogeneously distributed incremental LMS algorithm with ideal links on the quality of unknown parameter estimation. In heterogeneous adaptive networks, a fraction of the nodes, defi...
متن کاملCollective Data Mining from Distributed, Vertically Partitioned Feature Space
This paper develops collective data mining, a unique approach for nding patterns from a network of databases, each with a distinct feature space. This paper addresses both distributed cooperative learning at the global level and also learning at the local data sites. In addition to developing the foundation of the collective data mining , it also presents BODHI, a distributed data mining (DDM) ...
متن کاملDistributed Multivariate Regression Using Wavelet-Based Collective Data Mining
This paper presents a method for distributed multivariate regression using waveletbased Collective Data Mining (CDM). The method seamlessly blends machine learning and the theory of communication with the statistical methods employed in parametric multivariate regression to provide an effective data mining technique for use in a distributed data and computation environment. The technique is app...
متن کامل